3.5 Consistency of approximate M-estimators of $\psi$ type. As in Sec. 3.3, let
$(X, \mathcal{A}, P)$ be a probability space and $\Theta$ a locally compact separable metric space. Let
$\psi(\theta, x)$ be a function of $x$ in $X$ and $\theta \in \Theta$ with values in a Euclidean space $\mathbb{R}^m$. Let
$X_1, X_2, \ldots$ be independent with values in $X$ and distribution $P$. A sequence of estimators
$T_n := T_n(X_1, \ldots, X_n)$ with values in $\Theta$ will be called approximate M-estimators of $\psi$
type if

(3.5.1) $\frac{1}{n}\sum_{i=1}^{n} \psi(T_n, X_i) \to 0$ almost uniformly as $n \to \infty$.

If $\psi$ is jointly measurable, as will follow from assumptions to be given, then since estimators
$T_n$ by definition are assumed to be statistics (measurable functions of the observations),
the almost uniform convergence in (3.5.1) will be equivalent to almost sure convergence.
Recall that if $T_n$ are M-estimators of $\psi$ type, the expression on the left in (3.5.1) equals 0,
at least with probabilities converging to 1. Convergence of $T_n$ to some $\theta_0$ will be proved
under some assumptions as follows.
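As a concrete illustration of definition (3.5.1), here is a minimal Python sketch for the 1-dimensional location case. It assumes $\psi(\theta, x) = \rho'(x - \theta)$ with $\rho(u) = (1 + u^2)^{1/2}$, a normal sample with center 2, and a bisection solver; the function names, the seed and the sample size are choices made for this sketch only, not part of the text. For this $\psi$ the map $\theta \mapsto \frac{1}{n}\sum_i \psi(\theta, X_i)$ is continuous and strictly decreasing, so bisection finds a point where the left side of (3.5.1) is zero up to rounding.

```python
import math
import random

def psi(theta, x):
    """psi(theta, x) = rho'(x - theta) for rho(u) = sqrt(1 + u^2) (assumed example)."""
    u = x - theta
    return u / math.sqrt(1.0 + u * u)

def mean_psi(theta, xs):
    """Left side of (3.5.1) at a fixed theta: (1/n) * sum_i psi(theta, X_i)."""
    return sum(psi(theta, x) for x in xs) / len(xs)

def approx_m_estimate(xs, tol=1e-10):
    """Bisection for a zero of theta -> mean_psi(theta, xs).

    The map is positive at min(xs) - 1 and negative at max(xs) + 1,
    and strictly decreasing in between, so the bracket is valid.
    """
    lo, hi = min(xs) - 1.0, max(xs) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_psi(mid, xs) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(0)                                   # fixed seed, for reproducibility
sample = [rng.gauss(2.0, 1.0) for _ in range(2000)]      # assumed law: N(2, 1)
t_n = approx_m_estimate(sample)
print(t_n, mean_psi(t_n, sample))
```

Any sequence $T_n$ for which the printed average tends to 0 qualifies as an approximate M-estimator; an exact root is not required.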

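(B-1) For each $\theta \in \Theta$ the function $\psi(\theta, \cdot)$ is $\mathcal{A}$-measurable.
(B-2) For almost all $x$, $\psi(\cdot, x)$ is continuous on $\Theta$.

(B-3) $\lambda(\theta) := E\psi(\theta, \cdot)$ is defined and finite for all $\theta$, and for some $\theta_0$, $\lambda(\theta_0) = 0$, while
$\lambda(\theta) \neq 0$ for all $\theta \neq \theta_0$.
(B-4) There is a continuous, positive function $b(\cdot)$ on $\Theta$, bounded away from 0, so that for
some $b_0 > 0$, $b(\theta) \geq b_0$ for all $\theta$, and

(i) $\Psi(x) := \sup_{\theta} |\psi(\theta, x)|/b(\theta)$ is integrable,

(ii) $\liminf_{\theta\to\infty} |\lambda(\theta)|/b(\theta) \geq 1$, and

(iii) $E\{\limsup_{\theta\to\infty} |\psi(\theta, x) - \lambda(\theta)|/b(\theta)\} < 1$.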

A first question about the assumptions is: how are we to verify them, given that the
true distribution $P$ of the observations is unknown? (B-1) and (B-2) don't depend on $P$,
so they can be checked. In (B-3), $\psi(\theta, \cdot)$ will be integrable for all $P$ and $\theta$ if it is a bounded
function of $x$ for each $\theta$. If $\psi$ is bounded uniformly in $x$ and $\theta$, as for the classes of $\psi$
functions with $-A \leq \psi(\theta, x) \leq A < +\infty$ considered in the 1-dimensional location case, so
much the better.

To verify that there is a unique $\theta_0$ with $\lambda(\theta_0) = 0$ is not as easy, but if $\psi$ has some
strict monotonicity property (or multidimensional extensions of such a property) it may
be possible to show that this is true for all $P$. It may also be that existence of a
pseudo-true $\theta_0$ is a restriction on $P$, for example, in the case of the median, that $P$ has to have a
unique median (although actually (B-2) doesn't hold for the $\psi$ function corresponding to
the median).

For (B-4)(i), $\Psi(x)$ will be integrable for all $P$ if and only if it is bounded. For this,
it's sufficient that $|\psi(\theta, x)|$ be bounded uniformly in $\theta$ and $x$.

For (B-4)(ii), if $\theta \in \mathbb{R}$ and $\psi$ is real-valued, one way to ensure the condition for all $P$
is that for all $x$,

$$\liminf_{\theta\to-\infty} \psi(\theta, x)/b(\theta) \geq 1 \quad\text{and}\quad \limsup_{\theta\to+\infty} \psi(\theta, x)/b(\theta) \leq -1.$$

In higher dimensions, the situation is more complicated because instead of just two
directions for going to infinity there are infinitely many, but on the other hand for $\psi$
vector-valued, it will tend to be small or zero less often.

(B-4)(iii) will hold if $\limsup_{\theta\to\infty} |\psi(\theta, x) - \lambda(\theta)|/b(\theta) < 1$ for all $x$, but this may still
not be straightforward to check since $\lambda(\theta)$ depends on the unknown $P$.

A difficulty about (B-4) is that parts (i) and (iii) require $b(\theta)$ to be not too small,
whereas part (ii) requires it to be not too large.
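The role of $b(\theta)$ in (B-4)(ii) can be seen numerically. The sketch below is only an illustration under stated assumptions: it takes $\psi(\theta, x) = \rho'(x - \theta)$ with $\rho(u) = (1 + u^2)^{1/2}$, uses a simulated N(0, 1) sample as a stand-in for $P$, and compares the constant choices $b \equiv 1$ and $b \equiv 2$. All function names are hypothetical. The averaged deviation it prints is a finite-$\theta$ proxy for the quantity in (iii), not the limsup itself.

```python
import math
import random

def psi(theta, x):
    """Assumed example: rho'(x - theta) for rho(u) = sqrt(1 + u^2)."""
    u = x - theta
    return u / math.sqrt(1.0 + u * u)

rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # assumed law: N(0, 1)

def lam_hat(theta):
    """Sample average used as a stand-in for lambda(theta) = E psi(theta, .)."""
    return sum(psi(theta, x) for x in xs) / len(xs)

def ratio_ii(theta, b):
    """|lambda(theta)| / b(theta) for a constant function b, as in (B-4)(ii)."""
    return abs(lam_hat(theta)) / b

def mean_dev_iii(theta, b):
    """Average of |psi(theta, x) - lambda(theta)| / b at one fixed theta."""
    center = lam_hat(theta)
    return sum(abs(psi(theta, x) - center) for x in xs) / (len(xs) * b)

for theta in (-100.0, -10.0, 10.0, 100.0):
    print(theta, ratio_ii(theta, 1.0), ratio_ii(theta, 2.0), mean_dev_iii(theta, 1.0))
```

With $b \equiv 1$ the ratio approaches 1 for large $|\theta|$, while with $b \equiv 2$ it approaches 1/2, so in this example the larger constant is too large for part (ii).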

Some other comments on the assumptions: recall that one way $\psi$ functions commonly
arise is as the gradients with respect to $\theta$ of $\rho$ functions. If so, then (B-2) implies that for
almost all $x$, $\rho$ is a $C^1$ function of $\theta$, as the narrow-sense Huber functions are. (B-1)
and (B-2) imply that $\psi$ is separable, as in (A-1) of Sec. 3.3, with $S$ any countable dense
subset of $\Theta$ and $A := \{x : \psi(\cdot, x) \text{ is not continuous}\}$. (B-1), (B-2) and the integrability in
(B-3) are mild regularity conditions. If there were $\theta \neq \phi$ with $\lambda(\theta) = \lambda(\phi) = 0$, then (3.5.1)
could hold by the law of large numbers when $T_n$ is either near $\phi$ or near $\theta$, so $T_n$ would
not necessarily converge. Thus $\lambda$ having a unique zero at some $\theta_0$ is a natural assumption
for consistency, specifically $T_n \to \theta_0$. (B-4) is the most technical, least intuitive of the
assumptions.
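To see how a non-unique zero of $\lambda$ can arise, the following Python sketch computes $\lambda$ exactly for a two-point law. The setup is an assumption made for illustration: $\psi(\theta, x)$ is $x - \theta$ truncated to $[-b, b]$ with $b = 1$ (proportional to the derivative of a narrow-sense Huber function), and $P$ puts mass 1/2 at each of $-3$ and $3$. The names `psi_huber`, `lam` and `mean_psi` are hypothetical.

```python
def psi_huber(theta, x, b=1.0):
    """x - theta truncated to [-b, b]; an assumed Huber-type psi function."""
    return max(-b, min(b, x - theta))

def lam(theta, atoms=(-3.0, 3.0), probs=(0.5, 0.5)):
    """lambda(theta) = E psi(theta, .) computed exactly for a discrete law."""
    return sum(p * psi_huber(theta, a) for a, p in zip(atoms, probs))

def mean_psi(theta, xs):
    """Left side of (3.5.1) at a fixed theta."""
    return sum(psi_huber(theta, x) for x in xs) / len(xs)

for theta in (-2.5, -2.0, -1.0, 0.0, 1.0, 2.0, 2.5):
    print(theta, lam(theta))

# A balanced sample from this law: both theta = -1 and theta = 1 solve the
# estimating equation exactly, so a sequence T_n may oscillate between them.
balanced = [-3.0, 3.0] * 50
print(mean_psi(-1.0, balanced), mean_psi(1.0, balanced))
```

Here $\lambda$ vanishes on the whole interval $[-2, 2]$, so (B-3) fails for this law, and estimators alternating between $-1$ and $1$ would satisfy (3.5.1) on balanced samples without converging.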

Assumption (B-4)(i) gives $|\psi(\theta, x)| \leq \Psi(x) b(\theta)$ for an integrable function $\Psi$. If $U$ is
a neighborhood of $\theta$ whose closure is compact, then $b(\cdot)$, being continuous, is bounded on
$U$. It follows that

$$\sup\{|\psi(\theta, x) - \psi(\phi, x)| : \phi \in U\} \leq 2\Psi(x) \sup\{b(\phi) : \phi \in U\},$$

an integrable function. This, assumption (B-2), and dominated convergence imply:
(B-2') For any $\theta$, as a neighborhood $U$ of $\theta$ converges to $\{\theta\}$,

$$E(\sup\{|\psi(\theta, x) - \psi(\phi, x)| : \phi \in U\}) \to 0.$$

Then, it follows that $\lambda(\cdot)$ as defined in (B-3) is continuous.
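Condition (B-2') can be checked numerically in simple cases. The Python sketch below is an illustration under assumptions: it uses $\psi(\theta, x) = \rho'(x - \theta)$ with $\rho(u) = (1 + u^2)^{1/2}$, a simulated N(0, 1) sample in place of $P$, intervals $U = [\theta - r, \theta + r]$ as neighborhoods, and a finite grid to approximate the supremum. For this $\psi$, which is monotone in $\theta$ with $|\partial\psi/\partial\theta| \leq 1$, the supremum is attained at an endpoint and is at most $r$.

```python
import math
import random

def psi(theta, x):
    """Assumed example: rho'(x - theta) for rho(u) = sqrt(1 + u^2)."""
    u = x - theta
    return u / math.sqrt(1.0 + u * u)

def sup_increment(theta, x, r, grid=41):
    """Grid approximation of sup{|psi(theta,x) - psi(phi,x)| : |phi - theta| <= r}."""
    phis = [theta - r + 2.0 * r * k / (grid - 1) for k in range(grid)]
    return max(abs(psi(theta, x) - psi(phi, x)) for phi in phis)

rng = random.Random(2)
xs = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # assumed law: N(0, 1)

def mean_sup(theta, r):
    """Sample average standing in for the expectation in (B-2')."""
    return sum(sup_increment(theta, x, r) for x in xs) / len(xs)

for r in (1.0, 0.1, 0.01):
    print(r, mean_sup(0.0, r))
```

The printed averages shrink with $r$, in line with the convergence to 0 asserted in (B-2').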

The next fact is not needed for the proof of consistency (Theorem 3.5.4 below) but it
may be useful in checking hypothesis (B-4) by suggesting what function(s) to use for $b(\theta)$,
if we can control $\lambda(\theta)$ well enough without knowing $P$.

3.5.2 Proposition. If (B-1) through (B-4) all hold, for some $b(\theta)$ and $b_0$, then (B-4)
also holds for $B(\theta) := \max(|\lambda(\theta)|, b_0)$ or $B_1(\theta) := \max(|\lambda(\theta)|, b_0')$ where $b_0' :=
\liminf_{\phi\to\infty} |\lambda(\phi)|$, in place of $b(\theta)$.

Proof. Clearly, $B(\cdot)$ is continuous and $\geq b_0$. From (ii) for $b(\theta)$ and $b(\theta) \geq b_0$, we have
$b_0' \geq b_0$, so (ii) holds for $B(\cdot)$. Also, (ii) for $b(\cdot)$ implies that for any $\varepsilon > 0$, there is a
compact $K$ such that for $\theta \notin K$, $b(\theta) \leq (1 + \varepsilon)|\lambda(\theta)|$. For $\varepsilon$ small enough, this implies (iii)
for $B(\cdot)$, and also (i) for the supremum over the complement of $K$. For the supremum over
$K$, (i) is equivalent for any two positive continuous functions, such as $b(\cdot)$ and $B(\cdot)$, both
bounded away from 0.

Since $B_1 \geq B$, clearly (i) and (iii) hold for $B_1$. Also, from the definitions, (ii) holds
for $B_1$. □

Assuming (B-1), (B-2) and (B-3), we have that (B-4) holds for some function $b(\cdot)$ if
and only if both $b_0' > 0$ and (B-4) holds for $B_1(\theta)$ in place of $b(\theta)$, by Proposition 3.5.2.
Still, the function $\lambda(\cdot)$ depends on the law $P$ which usually is unknown to the statistician.
Thus the assumptions would usually need to be checked for all $P$ in some class (which may
or may not be parametrized by $\Theta$).

3.5.3 Lemma. If (B-1) and (B-4) hold, then for any sequence $\{T_n\}$ of approximate
M-estimators of $\psi$ type there is a compact set $C \subset \Theta$ such that $T_n \in C$ eventually a.s.,
specifically (3.3.11) holds.

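Proof. For a compact set $C$, let

$$w_C(x) := \sup\{|\psi(\theta, x) - \lambda(\theta)|/b(\theta) : \theta \notin C\}.$$

By (B-4)(i) and (iii) and dominated convergence, since $0 \leq w_C(x) \leq \Psi(x) + E\Psi$ for any
$C$, we can take $C$ large enough so that $E w_C(x) < 1$. Then we can take $\varepsilon > 0$ small
enough so that $E w_C < 1 - 3\varepsilon$. Note that if $C \subset D$ for another compact set $D$, we have
$w_D(x) \leq w_C(x)$ for all $x$ and so $E w_D \leq E w_C$. Thus by (B-4)(ii), we can take the compact
$C$ large enough so that $|\lambda(\theta)| \geq (1 - \varepsilon) b(\theta)$ for $\theta \notin C$, and $E w_C < 1 - 3\varepsilon$ still holds.
By the strong law of large numbers for $w_C(\cdot)$, a.s. for $n$ large enough

$$\sup\Big\{\Big|\frac{1}{n}\sum_{i=1}^{n} \psi(\theta, X_i) - \lambda(\theta)\Big|/b(\theta) : \theta \notin C\Big\} \leq \frac{1}{n}\sum_{i=1}^{n} w_C(X_i) \leq 1 - 2\varepsilon,$$

so for $\theta$ not in $C$,

$$\Big|\frac{1}{n}\sum_{i=1}^{n} \psi(\theta, X_i) - \lambda(\theta)\Big| \leq (1 - 2\varepsilon) b(\theta) \leq (1 - 2\varepsilon)|\lambda(\theta)|/(1 - \varepsilon) \leq (1 - \varepsilon)|\lambda(\theta)|,$$

so $\big|\frac{1}{n}\sum_{i=1}^{n} \psi(\theta, X_i)\big| \geq \varepsilon|\lambda(\theta)| \geq \varepsilon(1 - \varepsilon) b_0 > 0$.
This implies (3.3.11). □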

3.5.4 Theorem. Let $\{T_n\}$ be a sequence of approximate M-estimators of $\psi$ type. If

(a) (B-1), (B-2), (B-3) and (B-4) hold,
or if

(b) (B-1), (B-2') and (B-3) hold, and (3.3.11) holds for some compact $C$,
then $T_n \to \theta_0$ almost uniformly.

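Proof. Hypotheses (a) imply (B-2') as noted at its statement, and (3.3.11) by Lemma
3.5.3, so we can assume hypotheses (b). Then, we can assume that $\Theta$ is the compact set
$C$ in (3.3.11). By (B-2'), $\lambda(\cdot)$ is continuous. Let $U$ be any open neighborhood of $\theta_0$. Then
on the compact set $C \setminus U$, $|\lambda|$ is strictly positive by (B-3) and attains its infimum, which is
$\geq 5\delta$ for some $\delta > 0$. For each $\theta \in C \setminus U$, take a neighborhood $U_\theta$ by (B-2') such that

(3.5.5) $E(\sup\{|\psi(\phi, x) - \psi(\theta, x)| : \phi \in U_\theta\}) < \delta$.

Then $|\lambda(\phi) - \lambda(\theta)| < \delta$ for $\phi \in U_\theta$. Since $C \setminus U$ is compact, take a finite subcover
$U_j := U_{\theta_j}$ for some $M < \infty$ and $j = 1, \ldots, M$. Then

$$S := \sup\Big\{\Big|\frac{1}{n}\sum_{i=1}^{n} \psi(\phi, X_i) - \lambda(\phi)\Big| : \phi \in C \setminus U\Big\} \leq T_1 + T_2 + T_3$$

where

$$T_1 := \max_{1 \leq j \leq M} \frac{1}{n}\sum_{i=1}^{n} \sup\{|\psi(\phi, X_i) - \psi(\theta_j, X_i)| : \phi \in U_j\},$$

$$T_2 := \max_{1 \leq j \leq M} \Big|\frac{1}{n}\sum_{i=1}^{n} \psi(\theta_j, X_i) - \lambda(\theta_j)\Big|,$$

and $T_3 := \max_j \sup\{|\lambda(\phi) - \lambda(\theta_j)| : \phi \in U_j\}$. Then $T_3 < \delta$ by (3.5.5) since for each
$\phi \in U_j$,

$$|\lambda(\phi) - \lambda(\theta_j)| = |E\psi(\phi, x) - E\psi(\theta_j, x)| \leq E(\sup\{|\psi(\phi, x) - \psi(\theta_j, x)| : \phi \in U_j\}) < \delta.$$

Almost surely for $n$ large enough, $T_1 < 2\delta$ by (3.5.5) applied to $\theta = \theta_j$ and the strong
law of large numbers $M$ times; also, $T_2 < \delta$ by (B-3) and the strong law of large numbers
$M$ times, once for each $\theta_j$. Then $S < 2\delta + \delta + \delta = 4\delta$. But $|\lambda(\phi)| \geq 5\delta$ for $\phi \in C \setminus U$
implies that $\big|\frac{1}{n}\sum_{i=1}^{n} \psi(\phi, X_i)\big| > \delta$ for all $\phi \in C \setminus U$ and $n$ large enough, which implies $T_n \in U$
eventually a.s., specifically $1_{\{T_n \in U\}} \to 1$ almost uniformly. So $T_n \to \theta_0$ almost uniformly. □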
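The conclusion of Theorem 3.5.4 can be illustrated by simulation. The Python sketch below is a hedged example, not part of the proof: it assumes $\psi(\theta, x) = \rho'(x - \theta)$ with $\rho(u) = (1 + u^2)^{1/2}$ and draws data from a Cauchy law centered at 2, for which $\lambda(2) = 0$ by symmetry (this $\psi$ is odd in $x - \theta$) and the zero is unique because $\lambda$ is strictly decreasing. The solver, the seed and the sample sizes are choices made for this sketch.

```python
import math
import random

def psi(theta, x):
    """Assumed example: rho'(x - theta) for rho(u) = sqrt(1 + u^2)."""
    u = x - theta
    return u / math.sqrt(1.0 + u * u)

def solve(xs, tol=1e-9):
    """Bisection for the zero of theta -> sum_i psi(theta, x_i), which is decreasing."""
    lo, hi = min(xs) - 1.0, max(xs) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(psi(mid, x) for x in xs) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cauchy(rng, center):
    """One draw from a Cauchy law with the given center, by inverting the c.d.f."""
    return center + math.tan(math.pi * (rng.random() - 0.5))

rng = random.Random(3)
theta0 = 2.0
data = [cauchy(rng, theta0) for _ in range(20000)]

# Estimation error |T_n - theta0| on nested subsamples of increasing size.
errors = {n: abs(solve(data[:n]) - theta0) for n in (100, 1000, 20000)}
for n, err in errors.items():
    print(n, err)
```

The heavy-tailed law is used on purpose: the sample mean is not consistent for Cauchy data, whereas this bounded $\psi$ still gives estimates that settle near $\theta_0$.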

PROBLEMS 

1. Consider $\psi(\theta, x) = \rho'(x - \theta)$ for $\rho$ equal to wide-sense Huber function (b) on p. 6 of
section 3.4, $\rho(x) = (c^2 + x^2)^{1/2}$ for some $c > 0$. Take $c = 1$. Verify that in this section,
conditions (B-1) through (B-4) all hold for any law $P$. Hints: for (B-3), show that $\lambda'(\theta) < 0$
for all $\theta$, and find limits of $\lambda(\theta)$ as $\theta \to -\infty$ or $+\infty$.

2. Consider the narrow-sense Huber functions, (c) on p. 6 of section 3.4, where for some
$b > 0$, $\rho(x) = x^2$ for $|x| \leq b$ and $c|x| + d$ otherwise, where $c$ and $d$ are chosen to make $\rho$
a $C^1$ function. Show that in this case there is always a $\theta_0$ such that $\lambda(\theta_0) = 0$ (by the
intermediate value theorem: show that $\lambda(\cdot)$ is continuous, positive at some $\theta$ and negative
at some other $\theta$). Show however that if a law $P$ has an interval of medians longer than
$2b$, in other words its distribution function $F(x) = P((-\infty, x])$ is equal to 1/2 on such an
interval, then $\lambda(\theta)$ is not 0 at a unique point $\theta_0$ but is zero on some interval $(u, v)$ with
$u < v$.

NOTES 

This section is based on the paper by Huber (1967), pp. 224-226. 